Zero-initialize is_quoted_flags buffer in the CSV reader #22386

Open
vuule wants to merge 3 commits into rapidsai:main from vuule:bug-read_csv-uninit-flags

Conversation

@vuule
Contributor

@vuule vuule commented May 5, 2026

Description

Fixes initcheck errors in the CSV reader.

Remaining errors are false positives caused by cudaMemcpyBatchAsync use.

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@copy-pr-bot

copy-pr-bot Bot commented May 5, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@github-actions github-actions Bot added the "libcudf" label (Affects libcudf (C++/CUDA) code.) May 5, 2026
@vuule vuule added the "bug" (Something isn't working) and "non-breaking" (Non-breaking change) labels May 5, 2026
@vuule vuule marked this pull request as ready for review May 5, 2026 23:14
@vuule vuule requested a review from a team as a code owner May 5, 2026 23:14
@vuule vuule requested review from bdice and mhaseeb123 May 5, 2026 23:14
@coderabbitai

coderabbitai Bot commented May 7, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 5699ff07-6af6-4d4d-83ca-67db6d58ff7a

📥 Commits

Reviewing files that changed from the base of the PR and between 16c6356 and c27c281.

📒 Files selected for processing (1)
  • cpp/src/io/csv/reader_impl.cu

📝 Walkthrough

Summary by CodeRabbit

  • Bug Fixes
    • Improved CSV reader's GPU memory allocation and initialization for quoted field tracking in string columns, ensuring proper buffer setup during data decoding.

Walkthrough

The CSV reader's decode_data function changes memory allocation for quoted-field tracking buffers. String columns now allocate is_quoted_flags using a zero-initialized device uvector helper with explicit async stream and device resource references, replacing the prior non-zeroed constructor approach.
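The failure mode this change addresses can be sketched on the host in plain C++. This is an analogy, not the libcudf code: `make_stale_flags`, `make_zeroed_flags`, `mark_quoted`, and `count_quoted` are hypothetical names, the `0xCD` fill only simulates whatever bytes an uninitialized device allocation might hold, and the assumption that decoding writes a flag only for quoted fields is ours, not something the PR states.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a buffer that was allocated but never initialized:
// every slot holds a stale, nonzero byte (0xCD is an arbitrary choice).
std::vector<std::uint8_t> make_stale_flags(std::size_t n)
{
  return std::vector<std::uint8_t>(n, 0xCD);
}

// Hypothetical stand-in for a zero-initialized allocation.
std::vector<std::uint8_t> make_zeroed_flags(std::size_t n)
{
  return std::vector<std::uint8_t>(n, 0);
}

// Assumed decode behavior: a flag is written only when the field is quoted;
// unquoted fields leave their slot untouched.
void mark_quoted(std::vector<std::string> const& fields, std::vector<std::uint8_t>& flags)
{
  for (std::size_t i = 0; i < fields.size(); ++i) {
    auto const& f = fields[i];
    if (f.size() >= 2 && f.front() == '"' && f.back() == '"') { flags[i] = 1; }
  }
}

// A later pass reads every slot, including the ones mark_quoted never wrote.
std::size_t count_quoted(std::vector<std::uint8_t> const& flags)
{
  std::size_t n = 0;
  for (auto f : flags) { n += (f != 0); }
  return n;
}
```

With the fields `"a,b"`, `c`, `"d"`, `e` (two of them quoted), the stale buffer reports four quoted fields and the zeroed buffer reports two. On the GPU the stale contents are unpredictable rather than a fixed pattern, which is the kind of read initcheck is designed to flag.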

Changes

CSV Quoted-Flags GPU Memory Allocation

Layer / File(s) | Summary
GPU Memory Allocation for Quoted Flags (cpp/src/io/csv/reader_impl.cu) | The is_quoted_flags_storage allocation for string columns is replaced with a zero-initialized device uvector helper using the active CUDA stream and get_current_device_resource_ref(), ensuring quoted-flag buffers start as zeroed values before downstream decoding logic writes to them.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name | Status | Explanation
Title check | ✅ Passed | The title 'Zero-initialize is_quoted_flags buffer in the CSV reader' directly and specifically describes the main change: zero-initializing a GPU buffer in the CSV reader implementation.
Description check | ✅ Passed | The description is related to the changeset, explaining that it fixes initcheck errors by zero-initializing the is_quoted_flags buffer, which matches the code changes and PR objectives.
Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request.

@vuule
Contributor Author

vuule commented May 7, 2026

/ok to test c27c281

Labels

  • bug: Something isn't working
  • libcudf: Affects libcudf (C++/CUDA) code.
  • non-breaking: Non-breaking change

2 participants